Extracting Relevant and Trustworthy Information from Microblogs
Author
Abstract
Microblogging sites like Twitter have emerged as a popular platform for exchanging real-time information on the Web. Twitter is used by hundreds of millions of users, ranging from popular news organizations and celebrities to domain experts in fields like computer science and astrophysics, as well as spammers. As a result, the quality of information posted on Twitter is highly variable, and finding the users who are authoritative sources of relevant and trustworthy information on specific topics (i.e., topical experts) is a key challenge. I will attempt to address this challenge in this two-part talk. In the first part of the talk, I will focus on understanding and combating link farming activity on Twitter. Users, especially spammers, resort to link farming to acquire large numbers of follower links in the social network. Acquiring followers not only increases the size of a user’s direct audience but also contributes to the perceived influence of the user, which in turn affects the ranking of the user’s tweets by search engines. I will first discuss results from our recent studies investigating link farming activity in the Twitter network and then propose mechanisms to discourage the activity. In the second part of the talk, I will focus on the problem of finding topical experts on Twitter. I will propose a new methodology that relies on the wisdom of the Twitter crowds. Specifically, we leverage Twitter Lists, which are often carefully created by individual users to include experts on topics that interest them, and whose metadata (List names and descriptions) provide valuable semantic cues to the experts’ domains of expertise. I will first describe how we mined List information to build Cognos, an expert search system for Twitter, and then present results from a real-world deployment.
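The List-based idea in the abstract can be illustrated with a minimal sketch. This is not the Cognos implementation, which the abstract does not detail; it only assumes that each List is available as a dictionary with hypothetical `name`, `description`, and `members` fields, and that a user's association with a topic can be approximated by counting how many Lists mention that term. The stop-word set, the `min_lists` threshold, and the sample data are all illustrative assumptions.

```python
import re
from collections import Counter, defaultdict

# Illustrative stop words (an assumption); a real system would need a fuller list.
STOP_WORDS = {"the", "a", "an", "of", "and", "on", "in", "for", "to",
              "my", "i", "who", "people", "list", "follow"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def infer_topics(lists):
    """For each member, count how many Lists mention each term in their metadata."""
    topics = defaultdict(Counter)
    for lst in lists:
        # Use a set so one List contributes at most one vote per term.
        terms = set(tokenize(lst["name"] + " " + lst["description"]))
        for member in lst["members"]:
            topics[member].update(terms)
    return topics


def find_experts(topics, query, min_lists=2):
    """Rank members by the number of Lists associating them with the query term."""
    q = query.lower()
    scored = [(user, counts[q]) for user, counts in topics.items()
              if counts[q] >= min_lists]
    # Highest count first; ties broken alphabetically by user name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data, not drawn from any real Twitter Lists.
sample_lists = [
    {"name": "Astrophysics", "description": "People who tweet about space",
     "members": ["alice", "bob"]},
    {"name": "space science", "description": "astrophysics and cosmology",
     "members": ["alice"]},
    {"name": "Music", "description": "bands I like", "members": ["carol"]},
]

topics = infer_topics(sample_lists)
experts = find_experts(topics, "astrophysics")
print(experts)
```

Requiring a term to appear in several independent Lists is one simple way to reflect the "wisdom of the crowds" intuition: a single curator's label counts for little, while agreement across curators counts for more.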
Similar resources
A rapid-prototyping framework for extracting small-scale incident-related information in microblogs: Application of multi-label classification on tweets
Small-scale incidents such as car crashes or fires occur with high frequency and in total involve more people and consume more money than large and infrequent incidents. Therefore, the support of small-scale incident management is of high importance. Microblogs are an important source of information to support incident management as important situational information is shared, both by citizens an...
News Feature Extraction for Events on Social Network Platforms
Microblog-based social network platforms like Twitter and Sina Weibo have been important sources for news event extraction. However, existing works on microblog event extraction, which usually use keywords, entities, or selected microblogs to represent events, are not able to extract details of an event. Based on the view of news report, an event should present detailed news features, i.e., whe...
A Corpus for Entity Profiling in Microblog Posts
Microblogs have become an invaluable source of information for the purpose of online reputation management. Streams of microblogs are of great value because of their direct and real-time nature. An emerging problem is to identify not only microblog posts (such as tweets) that are relevant for a given entity, but also the specific aspects that people discuss. Determining such aspects can be non-...
Information Extraction from Microblogs
Microblogging sites contain the emotions and expressions of the public in raw form. The data can be used to extract much meaningful information that could be used to develop technologies for future use. There are numerous microblogging sites available these days that are used in different contexts. Some are used basically for conversation, some for image and video sharing, and some for fo...
Overview of the FIRE 2016 Microblog track: Information Extraction from Microblogs Posted during Disasters
The FIRE 2016 Microblog track focused on retrieval of microblogs (tweets posted on Twitter) during disaster events. A collection of about 50,000 microblogs posted during a recent disaster event was made available to the participants, along with a set of seven practical information needs during a disaster situation. The task was to retrieve microblogs relevant to these needs. 10 teams participat...